Current Issue: October - December, Volume: 2012, Issue Number: 4, Articles: 6
An FPGA-based Linux test-bed was constructed to measure its sensitivity to single-event upsets. The test-bed consists of two Xilinx ML410 development boards connected by a 124-pin custom connector board. The Design Under Test (DUT) consists of the “hard core” PowerPC, running the Linux OS, and several peripherals implemented in “soft” (programmable) logic. Faults were injected via the Internal Configuration Access Port (ICAP). The experiments performed here demonstrate that the Linux-based system was sensitive to 199,584, or about 1.4 percent, of all tested bits. Each sensitive bit in the bit-stream is mapped to the resource and user module that it configures. A density metric for comparing the reliability of modules within the system is presented. Using this density metric, we found that the most sensitive user module in the design was the PowerPC’s direct connections to the DDR2 memory controller....
Mathematical morphology supplies powerful tools for low-level image analysis. Many applications in computer vision require dedicated hardware for real-time execution. Designing morphological operators for a given application is not trivial. Genetic programming is a branch of evolutionary computing, and it is consolidating as a promising method for digital image processing applications. The main objective of genetic programming is to discover how computers can learn to solve problems without being explicitly programmed to do so. In this paper, the development of an original reconfigurable architecture using logical, arithmetic, and morphological instructions generated automatically by a genetic programming approach is presented. The developed architecture is based on FPGAs, and its possible applications include automatic image filtering, pattern recognition, and emulation of unknown filters. Practical applications to binary, gray-level, and color images using the developed architecture are presented, and the results are compared with similar techniques found in the literature....
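As a rough illustration of the morphological instructions this abstract refers to, the sketch below implements binary erosion and dilation in plain Python. It is not the architecture described in the paper: the cross-shaped structuring element `CROSS`, the zero-padded border handling, and the function names `erode` and `dilate` are all assumptions chosen for the example.

```python
# Minimal sketch of binary erosion and dilation (assumed names and conventions,
# not taken from the paper). Images are lists of rows holding 0 or 1.

# Hypothetical structuring element: a 3x3 cross given as (dy, dx) offsets.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def _get(img, y, x):
    """Read a pixel, treating everything outside the image as 0 (assumption)."""
    if 0 <= y < len(img) and 0 <= x < len(img[0]):
        return img[y][x]
    return 0


def erode(img, se=CROSS):
    """A pixel survives only if the structuring element fits entirely in the foreground."""
    return [[int(all(_get(img, y + dy, x + dx) for dy, dx in se))
             for x in range(len(img[0]))]
            for y in range(len(img))]


def dilate(img, se=CROSS):
    """A pixel is set if the structuring element, placed on any foreground pixel, covers it."""
    return [[int(any(_get(img, y - dy, x - dx) for dy, dx in se))
             for x in range(len(img[0]))]
            for y in range(len(img))]


# A 3x3 square of foreground pixels inside a 5x5 image.
square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

eroded = erode(square)    # only the centre pixel remains
opened = dilate(eroded)   # erosion followed by dilation (an "opening")
```

Chaining such primitives, as in the opening above, is the kind of operator sequence that a search procedure could in principle assemble automatically.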
Parallel graph-oriented applications expressed in the Bulk-Synchronous Parallel (BSP) and Token Dataflow compute models generate highly structured communication workloads from messages propagating along graph edges. We can statically expose this structure to traffic compilers and optimization tools to reshape and reduce traffic for higher performance (or lower area, lower energy, lower cost). Such offline traffic optimization eliminates the need for complex runtime NoC hardware and enables lightweight, scalable NoCs. We perform load balancing, placement, fanout routing, and fine-grained synchronization to optimize our workloads for large networks of up to 2025 parallel elements for the BSP model and 25 parallel elements for Token Dataflow. This allows us to demonstrate speedups between 1.2× and 22× (3.5× mean), area reductions (number of Processing Elements) between 3× and 15× (9× mean), and dynamic energy savings between 2× and 3.5× (2.7× mean) over a range of real-world graph applications in the BSP compute model. We deliver speedups of 0.5–13× (geomean 3.6×) for Sparse Direct Matrix Solve (Token Dataflow compute model) applied to a range of sparse matrices when using a high-quality placement algorithm. We expect such traffic optimization tools and techniques to become an essential part of the NoC application-mapping flow....
Massively parallel reconfigurable architectures, which couple massive parallelism with the capability of run-time reconfiguration, are gaining attention as a way to meet the increased computational demands of high-performance embedded systems. We propose that the occam-pi language be used to program this category of massively parallel reconfigurable architectures. The salient properties of the occam-pi language are explicit concurrency with built-in mechanisms for interprocessor communication, provision for expressing dynamic parallelism, support for the expression of dynamic reconfigurations, and placement attributes. To evaluate the programming approach, a compiler framework was extended to support the language extensions in the occam-pi language, and a backend was developed to target the Ambric array of processors. We present two case studies: a DCT implementation exploiting the reconfigurability feature of occam-pi and a significantly large autofocus criterion calculation based on the dynamic parallelism capability of the occam-pi language. The results of the implemented case studies suggest that the occam-pi-language-based approach simplifies the development of applications employing run-time reconfigurable devices without compromising the performance benefits....
Growing ubiquity and safety relevance of embedded systems strengthen the need to protect their functionality against malicious attacks. Communication and system authentication by digital signature schemes is a major issue in securing such systems. This contribution presents a complete ECDSA signature processing system over prime fields for bit lengths of up to 256 on reconfigurable hardware. By using dedicated hardware implementation, the performance can be improved by up to two orders of magnitude compared to microcontroller implementations. The flexible system is tailored to serve as an autonomous subsystem providing authentication transparent for any application. Integration into a vehicle-to-vehicle communication system is shown as an application example....
In this paper, a VHDL implementation of an 8-bit arithmetic logic unit (ALU) is presented. The design was implemented in VHDL using the Xilinx ISE 13.1 synthesis tool and targeted a Spartan device. The ALU was designed to perform arithmetic operations such as addition and subtraction using an 8-bit fast adder; logical operations such as AND, OR, XOR, and NOT; 1’s and 2’s complement operations; and comparison. The ALU consists of two input registers to hold the data during operation, one output register to hold the result of the operation, an 8-bit fast adder with a 2’s complement circuit to perform subtraction, and logic gates to perform the logical operations. The maximum propagation delay is 13.588 ns and the power dissipation is 38 mW. The ALU was designed for a controller used in a network interface card....
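The operation set listed in this abstract can be sketched as a small behavioral model. The following Python is only an illustrative software model under assumed conventions, not the authors' VHDL design: the operation names (`"ADD"`, `"SUB"`, `"NEG"`, and so on), the `alu` function, and the choice to return a carry-out bit are all hypothetical. Subtraction is computed as `a + (~b) + 1`, which is the adder-plus-2’s-complement scheme the abstract describes.

```python
from typing import Tuple

MASK = 0xFF  # every operand and result is limited to 8 bits


def alu(op: str, a: int, b: int = 0) -> Tuple[int, int]:
    """Return (result, carry_out) for one 8-bit operation.

    Operation names and the carry_out convention are assumptions made
    for this sketch; they are not taken from the paper.
    """
    a &= MASK
    b &= MASK
    if op == "ADD":
        total = a + b
    elif op == "SUB":
        # a - b via the adder: add the 1's complement of b, plus 1.
        # With this scheme carry_out == 1 means no borrow occurred.
        total = a + ((~b) & MASK) + 1
    elif op == "AND":
        total = a & b
    elif op == "OR":
        total = a | b
    elif op == "XOR":
        total = a ^ b
    elif op == "NOT":
        total = (~a) & MASK           # 1's complement of a
    elif op == "NEG":
        total = ((~a) + 1) & MASK     # 2's complement of a
    elif op == "CMP":
        total = 1 if a == b else 0    # equality compare (assumed meaning)
    else:
        raise ValueError(f"unknown operation: {op}")
    return total & MASK, (total >> 8) & 1


sum_result = alu("ADD", 200, 100)   # 300 overflows 8 bits
diff_result = alu("SUB", 5, 3)
borrow_result = alu("SUB", 3, 5)    # wraps around modulo 256
```

Masking with `0xFF` after every operation stands in for the fixed 8-bit width of the hardware registers; the ninth bit of the sum plays the role of the adder's carry-out.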